Semantic Concept Detection Using Dense Codeword Motion
Abstract
When detecting semantic concepts in video, much of the existing research in content-based classification uses keyframe information only. In particular, the combination of local features such as SIFT with the Bag of Words model is very popular among TRECVID participants. The few existing motion and spatiotemporal descriptors are computationally heavy and become impractical when applied to large datasets such as TRECVID. In this paper, we propose a way to efficiently combine positional motion obtained from optic flow in the keyframe with the information given by the Dense SIFT (DSIFT) Bag of Words feature. The features we propose work by spatially binning motion vectors belonging to the same codeword into separate histograms describing movement direction (left, right, vertical, zero, etc.). Features are mapped using the homogeneous kernel map technique, which approximates the χ² kernel, and classifiers are then trained efficiently using a linear SVM. Using a simple linear fusion technique, we improve the Mean Average Precision of the Bag of Words DSIFT classifier on the TRECVID 2010 Semantic Indexing benchmark from 0.0924 to 0.0972, an increase confirmed to be statistically significant by standardized TRECVID randomization tests.
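The binning and fusion steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the five direction bins, the `zero_threshold` value, the L1 normalisation, the fusion weight, and all function names are assumptions made for illustration, and the spatial layout of the bins is left out. The inputs are assumed to be one codeword index and one optic-flow vector per dense SIFT sample.

```python
import math

# Hypothetical direction bins. The abstract lists "left, right, vertical,
# zero, etc."; splitting vertical into up/down is an assumption.
DIRECTIONS = ("zero", "left", "right", "up", "down")


def direction_bin(dx, dy, zero_threshold=0.5):
    """Map a flow vector to a coarse direction label (image y axis points down)."""
    if math.hypot(dx, dy) < zero_threshold:
        return "zero"
    if abs(dx) >= abs(dy):
        return "right" if dx > 0 else "left"
    return "down" if dy > 0 else "up"


def codeword_motion_histogram(codewords, flows, vocab_size, zero_threshold=0.5):
    """Count motion directions separately for each codeword.

    codewords: codeword index of each dense SIFT sample
    flows:     (dx, dy) optic-flow vector at each sample position
    Returns a flat, L1-normalised list of length vocab_size * len(DIRECTIONS).
    """
    n_dirs = len(DIRECTIONS)
    hist = [0.0] * (vocab_size * n_dirs)
    for word, (dx, dy) in zip(codewords, flows):
        d = DIRECTIONS.index(direction_bin(dx, dy, zero_threshold))
        hist[word * n_dirs + d] += 1.0
    total = sum(hist)
    return [v / total for v in hist] if total else hist


def linear_fusion(score_a, score_b, weight=0.5):
    """Weighted sum of two classifier scores (the weight is an assumption)."""
    return weight * score_a + (1.0 - weight) * score_b


if __name__ == "__main__":
    words = [0, 0, 1, 2]
    flows = [(3.0, 0.2), (-2.0, 0.1), (0.0, 0.1), (0.3, -4.0)]
    print(codeword_motion_histogram(words, flows, vocab_size=3))
    print(linear_fusion(0.2, 0.6, weight=0.25))
```

In a full pipeline, a histogram like this would be passed through an approximate χ² feature map before a linear SVM is trained; that step is omitted here to keep the sketch free of external dependencies.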
Similar resources
Semantic Motion Concept Retrieval in Non-Static Background Utilizing Spatial-Temporal Visual Information
Motion concepts are concepts that contain motion information, such as racing car and dancing. To achieve retrieval accuracy comparable to that of static concepts such as car or person in semantic retrieval tasks, temporal information has to be considered. Additionally, if a video sequence is captured by an amateur using a hand-held camera containing significant camera motion...
Approximability of Dense Instances of NEAREST CODEWORD Problem
We give a polynomial time approximation scheme (PTAS) for dense instances of the Nearest Codeword problem.
High Dense Crowd Pattern and Anomaly Detection Using Statistical Model
Human crowd behavior analysis is a subject of great research interest nowadays. Investigating dense human crowds in places such as mosques and temples has the great advantage of enabling automatic surveillance that detects unusual activity, which must be addressed as early as possible to avoid accidents. We present a robust statistical skeleton for modeling a dense crowded s...
Combining Motion Understanding and Keyframe Image Analysis for Broadcast Video Information Extraction
We describe a robust new approach to extract semantic concept information based on explicitly encoding static image appearance features together with motion information. For high-level semantic concept detection in broadcast video, we trained multi-modality classifiers which combine the traditional static image features and a new motion feature analysis method (MoSIFT). The exper...
Pooling in image representation: The visual codeword point of view
In this work, we propose BossaNova, a novel representation for content-based concept detection in images and videos, which enriches the Bag-of-Words model. Relying on the quantization of highly discriminant local descriptors by a codebook, and the aggregation of those quantized descriptors into a single pooled feature vector, the Bag-of-Words model has emerged as the most promising approach for ...
Publication date: 2013